Learned-Norm Pooling for Deep Feedforward and Recurrent Neural Networks
In this paper we propose and investigate a novel nonlinear unit, called the
L_p unit, for deep neural networks. The proposed L_p unit receives signals from
several projections of a subset of units in the layer below and computes a
normalized L_p norm. We notice two interesting interpretations of the L_p
unit. First, the proposed unit can be understood as a generalization of a
number of conventional pooling operators such as average, root-mean-square and
max pooling widely used in, for instance, convolutional neural networks (CNN),
HMAX models and neocognitrons. Furthermore, the L_p unit is, to a certain
degree, similar to the recently proposed maxout unit (Goodfellow et al., 2013)
which achieved the state-of-the-art object recognition results on a number of
benchmark datasets. Secondly, we provide a geometrical interpretation of the
activation function based on which we argue that the L_p unit is more
efficient at representing complex, nonlinear separating boundaries. Each L_p
unit defines a superelliptic boundary, with its exact shape defined by the
order p. We claim that this makes it possible to model arbitrarily shaped,
curved boundaries more efficiently by combining a few L_p units of different
orders. This insight justifies the need for learning different orders for each
unit in the model. We empirically evaluate the proposed L_p units on a number
of datasets and show that multilayer perceptrons (MLP) consisting of the L_p
units achieve the state-of-the-art results on a number of benchmark datasets.
Furthermore, we evaluate the proposed L_p unit on the recently proposed deep
recurrent neural networks (RNN). Comment: ECML/PKDD 201
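The pooling interpretation in this abstract can be illustrated with a minimal NumPy sketch. It assumes the normalized norm takes the generalized-mean form (mean_i |w_i . x - c_i|^p)^(1/p); the function name `lp_unit`, the projection matrix `W` and the centres `c` are hypothetical names chosen for illustration, not taken from the paper's code.

```python
import numpy as np

def lp_unit(x, W, c, p):
    """Normalized L_p norm of N centred linear projections of x.

    Assumed form: (mean_i |w_i . x - c_i| ** p) ** (1 / p).
    """
    z = np.abs(W @ x - c)              # one non-negative value per projection
    return float(np.mean(z ** p) ** (1.0 / p))

x = np.array([1.0, -2.0, 3.0])
W, c = np.eye(3), np.zeros(3)          # identity projections, for illustration only

avg = lp_unit(x, W, c, p=1)            # p = 1: average of |x|
rms = lp_unit(x, W, c, p=2)            # p = 2: root-mean-square of x
near_max = lp_unit(x, W, c, p=200)     # large p: approaches max |x|
```

Under this assumed form, a single order parameter p moves the unit between average, root-mean-square and (in the limit) max pooling, which is why learning p per unit is plausible.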
Hardening against adversarial examples with the smooth gradient method
Commonly used methods in deep learning do not utilise transformations of the residual gradient available at the inputs to update the representation in the dataset. It has been shown that this residual gradient, which can be interpreted as the first-order gradient of the input sensitivity at a particular point, may be used to improve generalisation in feed-forward neural networks, including fully connected and convolutional layers. We explore how these input gradients are related to the input perturbations used to generate adversarial examples, and how networks trained with this technique are more robust to attacks generated with the fast gradient sign method.
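For readers unfamiliar with the attack named above, the following is a minimal sketch of one fast gradient sign step, using a toy logistic model in place of a deep network. The weights, input and `eps` value are hypothetical, and the sketch covers only the attack, not the smooth gradient training method the abstract proposes.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_loss(x, y, w, b):
    """Binary cross-entropy of a logistic model at input x with label y."""
    p = sigmoid(w @ x + b)
    return float(-(y * np.log(p) + (1 - y) * np.log(1 - p)))

def input_gradient(x, y, w, b):
    """Gradient of the loss with respect to the input (not the weights)."""
    return (sigmoid(w @ x + b) - y) * w

def fgsm(x, y, w, b, eps):
    """One fast gradient sign step: shift every input component by eps
    in the direction that increases the loss."""
    return x + eps * np.sign(input_gradient(x, y, w, b))

w, b = np.array([2.0, -1.0, 0.5]), 0.1   # hypothetical toy weights
x, y = np.array([0.3, -0.4, 0.8]), 1.0   # hypothetical input and label
x_adv = fgsm(x, y, w, b, eps=0.1)
```

The same input gradient that the attack takes the sign of is the quantity the abstract describes as the residual gradient available at the inputs.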
Transparency and Trust in Human-AI-Interaction: The Role of Model-Agnostic Explanations in Computer Vision-Based Decision Support
Computer Vision, and hence Artificial Intelligence-based extraction of
information from images, has increasingly received attention over the last
years, for instance in medical diagnostics. While the algorithms' complexity is
a reason for their increased performance, it also leads to the "black box"
problem, consequently decreasing trust towards AI. In this regard, "Explainable
Artificial Intelligence" (XAI) makes it possible to open that black box and to
improve the degree of AI transparency. In this paper, we first discuss the
theoretical impact of explainability on trust towards AI, followed by showcasing
what the use of XAI in a health-related setting can look like. More specifically, we
show how XAI can be applied to understand why Computer Vision, based on deep
learning, did or did not detect a disease (malaria) on image data (thin blood
smear slide images). Furthermore, we investigate how XAI can be used to
compare the detection strategy of two different deep learning models often used
for Computer Vision: Convolutional Neural Network and Multi-Layer Perceptron.
Our empirical results show that i) the AI sometimes used questionable or
irrelevant data features of an image to detect malaria (even if correctly
predicted), and ii) that there may be significant discrepancies in how
different deep learning models explain the same prediction. Our theoretical
discussion highlights that XAI can support trust in Computer Vision systems,
and AI systems in general, especially through an increased understandability
and predictability.
Deep Learning of Representations: Looking Forward
Deep learning research aims at discovering learning algorithms that discover
multiple levels of distributed representations, with higher levels representing
more abstract concepts. Although the study of deep learning has already led to
impressive theoretical results, learning algorithms and breakthrough
experiments, several challenges lie ahead. This paper proposes to examine some
of these challenges, centering on the questions of scaling deep learning
algorithms to much larger models and datasets, reducing optimization
difficulties due to ill-conditioning or local minima, designing more efficient
and powerful inference and sampling procedures, and learning to disentangle the
factors of variation underlying the observed data. It also proposes a few
forward-looking research directions aimed at overcoming these challenges.
Top-Down Feedback in an HMAX-Like Cortical Model of Object Perception Based on Hierarchical Bayesian Networks and Belief Propagation
PubMed ID: 2313976
A multi-biometric iris recognition system based on a deep learning approach
Multimodal biometric systems have been widely
applied in many real-world applications due to their ability to
deal with a number of significant limitations of unimodal
biometric systems, including sensitivity to noise, population
coverage, intra-class variability, non-universality, and
vulnerability to spoofing. In this paper, an efficient and
real-time multimodal biometric system is proposed based
on building deep learning representations for images of
both the right and left irises of a person, and fusing the
results obtained using a ranking-level fusion method. The
trained deep learning system proposed is called IrisConvNet
whose architecture is based on a combination of Convolutional
Neural Network (CNN) and Softmax classifier to
extract discriminative features from the input image without
any domain knowledge where the input image represents
the localized iris region and then classify it into one of N
classes. In this work, a discriminative CNN training scheme
based on a combination of back-propagation algorithm and
mini-batch AdaGrad optimization method is proposed for
weights updating and learning rate adaptation, respectively.
In addition, other training strategies (e.g., dropout method,
data augmentation) are also proposed in order to evaluate
different CNN architectures. The performance of the proposed
system is tested on three public datasets collected
under different conditions: SDUMLA-HMT, CASIA-Iris-V3
Interval and IITD iris databases. The results obtained
from the proposed system outperform other state-of-the-art
approaches (e.g., Wavelet transform, Scattering transform,
Local Binary Pattern and PCA) by achieving a Rank-1 identification rate of 100% on all the employed databases
and a recognition time of less than one second per person.
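The learning-rate adaptation mentioned in this abstract can be sketched with the standard AdaGrad update rule. This is a generic illustration on a toy quadratic objective, not the IrisConvNet training code; in practice `grad` would be a mini-batch gradient obtained by back-propagation, and the names `adagrad_step`, `lr` and `eps` are assumptions made for the example.

```python
import numpy as np

def adagrad_step(theta, grad, accum, lr=0.01, eps=1e-8):
    """One AdaGrad update: each parameter's step is scaled by the inverse
    square root of its accumulated squared gradients."""
    accum = accum + grad ** 2
    theta = theta - lr * grad / (np.sqrt(accum) + eps)
    return theta, accum

# Toy objective f(theta) = theta ** 2, whose gradient is 2 * theta.
theta = np.array([5.0])
accum = np.zeros_like(theta)
for _ in range(100):
    grad = 2.0 * theta
    theta, accum = adagrad_step(theta, grad, accum, lr=0.5)
```

Because the accumulator only grows, the effective step size of each parameter shrinks over time, which is the adaptation the abstract refers to.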